Enhancing Knowledge Transfer for Task Incremental Learning with Data-free Subnetwork
Gao, Qiang
DSN primarily transfers knowledge from previously learned tasks to the newly arriving task by selecting the weights affiliated with a small set of neurons to be activated, including neurons reused from prior tasks, via neuron-wise masks. It also transfers potentially valuable knowledge back to earlier tasks via data-free replay.
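The neuron-wise masking described above can be sketched as a toy NumPy example; the function name and the row-wise mask convention are illustrative assumptions, not the authors' code:

```python
import numpy as np

def subnetwork_weights(weight, active_neurons):
    """Keep only the weights affiliated with the selected (active) output
    neurons; rows of all other neurons are zeroed by the neuron-wise mask."""
    mask = np.zeros(weight.shape[0], dtype=bool)
    mask[active_neurons] = True          # reused + newly allocated neurons
    return weight * mask[:, None]

# A 4-neuron layer where the new task activates neurons 0 and 2
# (neuron 0 might be a neuron reused from a prior task).
W = np.arange(12, dtype=float).reshape(4, 3)
W_task = subnetwork_weights(W, [0, 2])
```

Only the rows of the activated neurons survive in `W_task`; the rest of the network stays untouched, which is what allows a fixed-capacity model to host many task subnetworks.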
Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer
For example, regularization-based methods (e.g., [12, 1, 18]) penalize the modification of important weights of old tasks; parameter-isolation based methods (e.g., [7, 26, 31, 9]) fix the model learnt for old tasks; and memory-based methods (e.g., [3, 6, 25]) aim to update the model with minimal interference introduced to old tasks. More specifically, we first introduce notions of 'sufficient projection' and 'positive correlation', based on the gradient projection onto the subspaces of old tasks, to characterize the task correlation.
Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer
By learning a sequence of tasks continually, an agent in continual learning (CL) can improve the learning performance of both a new task and 'old' tasks by leveraging the forward knowledge transfer and the backward knowledge transfer, respectively. However, most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks. This inevitably limits the backward knowledge transfer from the new task to the old tasks, because judicious model updates could possibly improve the learning performance of the old tasks as well. To tackle this problem, we first theoretically analyze the conditions under which updating the learnt model of old tasks could be beneficial for CL and also lead to backward knowledge transfer, based on the gradient projection onto the input subspaces of old tasks. Building on the theoretical analysis, we next develop a ContinUal learning method with Backward knowlEdge tRansfer (CUBER), for a fixed capacity neural network without data replay. In particular, CUBER first characterizes the task correlation to identify the positively correlated old tasks in a layer-wise manner, and then selectively modifies the learnt model of the old tasks when learning the new task. Experimental studies show that CUBER can even achieve positive backward knowledge transfer on several existing CL benchmarks for the first time without data replay, where the related baselines still suffer from catastrophic forgetting (negative backward knowledge transfer). The superior performance of CUBER on the backward knowledge transfer also leads to higher accuracy accordingly.
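The gradient-projection condition at the heart of this abstract can be illustrated with a small NumPy sketch. The orthonormal-basis representation of an old task's input subspace and the threshold value are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def project(g, basis):
    """Project a gradient g onto the subspace spanned by the (orthonormal)
    columns of `basis`, i.e. the input subspace of an old task."""
    return basis @ (basis.T @ g)

def sufficient_projection(g, basis, thresh=0.5):
    """A new-task gradient has 'sufficient projection' onto an old task's
    subspace when the projected component carries a large share of its norm,
    suggesting an update there could also benefit the old task."""
    return np.linalg.norm(project(g, basis)) >= thresh * np.linalg.norm(g)

# Old task's input subspace: the x-axis of R^3.
basis = np.array([[1.0], [0.0], [0.0]])
g = np.array([3.0, 4.0, 0.0])
ok = sufficient_projection(g, basis)  # ||proj|| = 3.0 >= 0.5 * 5.0 -> True
```

Combined with a sign check on the correlation between new- and old-task gradients ('positive correlation'), such a test decides per layer whether the old task's weights may be modified rather than frozen.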
Post-Training Language Models for Continual Relation Extraction
Efeoglu, Sefika, Paschke, Adrian, Schimmler, Sonja
Real-world data, such as news articles, social media posts, and chatbot conversations, is inherently dynamic and non-stationary, presenting significant challenges for constructing real-time structured representations through knowledge graphs (KGs). Relation Extraction (RE), a fundamental component of KG creation, often struggles to adapt to evolving data when traditional models rely on static, outdated datasets. Continual Relation Extraction (CRE) methods tackle this issue by incrementally learning new relations while preserving previously acquired knowledge. This study investigates the application of pre-trained language models (PLMs), specifically large language models (LLMs), to CRE, with a focus on leveraging memory replay to address catastrophic forgetting. We evaluate decoder-only models (e.g., Mistral-7B and Llama2-7B) and encoder-decoder models (e.g., Flan-T5 Base) on the TACRED and FewRel datasets. Task-incremental fine-tuning of LLMs demonstrates superior performance over earlier approaches using encoder-only models like BERT on TACRED, excelling in seen-task accuracy and overall performance (measured by whole and average accuracy), particularly with the Mistral and Flan-T5 models. Results on FewRel are similarly promising, achieving second place in whole and average accuracy metrics. This work underscores critical factors in knowledge transfer, language model architecture, and KG completeness, advancing CRE with LLMs and memory replay for dynamic, real-time relation extraction.
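A minimal sketch of the memory-replay idea described above: mix a few stored examples of earlier relations into each fine-tuning batch, and cap the buffer size. The buffer policy, sizes, and string-encoded examples are illustrative assumptions, not the paper's exact setup:

```python
import random

def build_batch(new_examples, memory, n_replay, rng=random):
    """Augment a batch of new-relation examples with replayed examples of
    previously learned relations to mitigate catastrophic forgetting."""
    replayed = rng.sample(memory, min(n_replay, len(memory)))
    return list(new_examples) + replayed

def update_memory(memory, new_examples, capacity):
    """Keep at most `capacity` examples, dropping the oldest first."""
    return (memory + list(new_examples))[-capacity:]

memory = update_memory([], ["rel_A:ex1", "rel_A:ex2"], capacity=3)
memory = update_memory(memory, ["rel_B:ex1", "rel_B:ex2"], capacity=3)
batch = build_batch(["rel_C:ex1"], memory, n_replay=2)
```

Each fine-tuning step then sees both the new relation and a sample of older ones, which is what keeps seen-task accuracy from collapsing as tasks accumulate.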
Adaptive Drift Compensation for Soft Sensorized Finger Using Continual Learning
Kushawaha, Nilay, Pathan, Radan, Pagliarani, Niccolò, Cianchetti, Matteo, Falotico, Egidio
Strain sensors are gaining popularity in soft robotics for acquiring tactile data due to their flexibility and ease of integration. Tactile sensing plays a critical role in soft grippers, enabling them to safely interact with unstructured environments and precisely detect object properties. However, a significant challenge with these systems is their high non-linearity, time-varying behavior, and long-term signal drift. In this paper, we introduce a continual learning (CL) approach to model a soft finger equipped with piezoelectric-based strain sensors for proprioception. To tackle the aforementioned challenges, we propose an adaptive CL algorithm that integrates a Long Short-Term Memory (LSTM) network with a memory buffer for rehearsal and includes a regularization term to keep the model's decision boundary close to the base signal while adapting to time-varying drift. We conduct nine different experiments, resetting the entire setup each time to demonstrate signal drift. We also benchmark our algorithm against two other methods and conduct an ablation study to assess the impact of different components on the overall performance.
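The objective structure described in this abstract, a task loss plus a regularization term anchoring the model to the base signal, can be sketched as follows. The squared-error form and the weight `lam` are assumptions for illustration, not the authors' exact loss:

```python
import numpy as np

def drift_adaptive_loss(pred, target, base_pred, lam=0.1):
    """Rehearsal-style objective: fit the drifted sensor signal (task term)
    while a regularizer keeps the model's outputs close to the base model's,
    so the decision boundary stays near the base signal while adapting."""
    task_term = np.mean((pred - target) ** 2)
    anchor_term = np.mean((pred - base_pred) ** 2)
    return task_term + lam * anchor_term

pred = np.array([1.0, 2.0])
target = np.array([1.0, 2.0])
base_pred = np.array([1.0, 1.0])
loss = drift_adaptive_loss(pred, target, base_pred, lam=0.1)
```

In the paper's setting the predictor is an LSTM trained with a rehearsal buffer; the sketch shows only how the anchor term trades off fitting the current (drifted) signal against staying close to the original calibration.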